Some useful SAS String Data Handling Functions
--------------------------------
CAT, CATS, CATT, CATX
The CAT functions concatenate and return the values passed. CAT concatenates values as if the concatenation operator, ||, were used. CATT removes trailing blanks from each argument before concatenating (think CAT plus TRIM). CATS strips leading and trailing blanks before concatenating. CATX is like CATS, but places a delimiter between values being concatenated. Numeric values are formatted to character values using BEST32. for numbers with many digits and BEST12. for numbers with fewer digits. These functions reduce the need to use the TRIM, LEFT, and PUT functions with the concatenation operator.data _null_; length a b c $ 8; a = 'some'; b = ' text'; n = 123.456; c = 'together'; cop = trim(a) || trim(left(b)) || trim(left(vvalue(n))) || c; cat = cat(a, b, n, c); catt = catt(a, b, n, c); cats = cats(a, b, n, c); catx = catx(',', a, b, n, c); put +1 cop= / +1 cat= / catt= / cats= / catx=; run;Output:cop=sometext123.456together cat=some text 123.456together catt=some text123.456together cats=sometext123.456together catx=some,text,123.456,together ------------------------------------------------
SCAN
The SCAN function. Scan allows you to isolate a word ( or group of letters ) in a text string given a definable set of delimitersdata _null_; array word(4) word1-word4; string='Here is my string'; do i=1 to 4; word(i)=scan( string, i, ' ' ); put word(i); end; run;Output: word(1) = 'Here' word(2) = 'is' word(3) = 'my' word(4) = 'string'
SUBSTR, SUBSTRN, LENGTH, LENGTHN
The SUBSTRN function is like the SUBSTR function, but the length parameter can be zero. When the length parameter is zero, a zero length value is returned. SUBSTRN can be useful with regexp matches that may be zero length. The LENGTHN function is like the LENGTH function, but with a value that is all blanks, zero is returned. In this example, notice the NOTE output when SUBSTR is passed a zero length. SUBSTRN does not cause a NOTE like this to be output.data _null_; length empty $ 8; substr = substr("some text", 1, 0); substrn = substrn("some text", 1, 0); put substr= / substrn=; length = length(empty); lengthn = lengthn(empty); put length= / lengthn=; run;Output: NOTE: Invalid third argument to function SUBSTR at line XX column XX. substr=some text substrn= length=1 lengthn=0-------------------------------------------------
VVALUE, VVALUEX
VVALUE takes a variable and returns the value formatted as a character string. VVALUEX takes a string that contains a variable name and returns the value of the variable formatted as a character value. This is like using the PUT function without having to specify a format. The format used to format values is the default format for that type. For numeric values, the default format is typically BEST12. For character values, the default format is typically $w., where w is the length of the value. The default format can be changed with the FORMAT statement.data _null_; n = 123.456; a = vvalue(n); b = vvaluex('n'); put "a = " a $12. / "b = " b $12.; run;Output: a = 123.456 b = 123.456-----------------------------------------------------
References: 1. Jason Secosky, SAS, Cary, NC., The DATA step in SAS 9: What's New?, Revised May 3, 2004, http://support.sas.com/rnd/base/datastep/dsv9-sugi-v3.pdf
Learn More : ftp://ftp.sas.com/techsup/download/sample/datastep/concat_function.html